Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection

Wessel, Martin; Horych, Tomáš; Ruas, Terry; Aizawa, Akiko; Gipp, Bela; Spinde, Timo

doi:10.1145/3539618.3591882

arXiv:2304.13148 (cs)

[Submitted on 25 Apr 2023]

Abstract: Although media bias detection is a complex multi-task problem, there is, to date, no unified benchmark grouping these evaluation tasks. We introduce the Media Bias Identification Benchmark (MBIB), a comprehensive benchmark that groups different types of media bias (e.g., linguistic, cognitive, political) under a common framework to test how prospective detection techniques generalize. After reviewing 115 datasets, we select nine tasks and carefully propose 22 associated datasets for evaluating media bias detection techniques. We evaluate MBIB using state-of-the-art Transformer techniques (e.g., T5, BART). Our results suggest that while hate speech, racial bias, and gender bias are easier to detect, models struggle to handle certain bias types, e.g., cognitive and political bias. However, our results show that no single technique can outperform all the others significantly. We also find an uneven distribution of research interest and resource allocation to the individual tasks in media bias. A unified benchmark encourages the development of more robust systems and shifts the current paradigm in media bias detection evaluation towards solutions that tackle not one but multiple media bias types simultaneously.

Comments:	To be published in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23)
Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2304.13148 [cs.IR]
	(or arXiv:2304.13148v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2304.13148
Related DOI:	https://doi.org/10.1145/3539618.3591882

Submission history

From: Martin Wessel
[v1] Tue, 25 Apr 2023 20:49:55 UTC (302 KB)
